Reverse engineering chemical structures from molecular descriptors: how many solutions?
Abstract
Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
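The notion of degeneracy can be illustrated on a very small scale. The sketch below is not the paper's method or code: it assumes the Wiener index (the sum of shortest-path distances over all pairs of carbon atoms) as a representative 2D topological descriptor, and it uses hand-written edge lists for the nine heptane isomers as a stand-in for the alkane isomer series. The names `HEPTANE_ISOMERS`, `wiener_index`, `wiener_values` and `wiener_degeneracy` are hypothetical and chosen only for this illustration.

```python
from collections import Counter, deque

# Hydrogen-suppressed carbon skeletons of the nine heptane (C7H16) isomers,
# written by hand as edge lists over atoms 0..6 (an illustrative data set).
HEPTANE_ISOMERS = {
    "n-heptane":             [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "2-methylhexane":        [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 6)],
    "3-methylhexane":        [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)],
    "2,2-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)],
    "2,3-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 6)],
    "2,4-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 6)],
    "3,3-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (2, 6)],
    "3-ethylpentane":        [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)],
    "2,2,3-trimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5), (2, 6)],
}


def wiener_index(edges):
    """Sum of shortest-path distances over all unordered atom pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest paths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    # Every pair was counted twice (once from each endpoint).
    return total // 2


# Descriptor value for each structure.
wiener_values = {name: wiener_index(edges) for name, edges in HEPTANE_ISOMERS.items()}

# Degeneracy: how many structures share each descriptor value.
wiener_degeneracy = Counter(wiener_values.values())

for value, count in sorted(wiener_degeneracy.items()):
    names = [n for n, v in wiener_values.items() if v == value]
    print(value, count, names)
```

On this toy set the nine isomers yield only seven distinct Wiener values: 46 is shared by 2,2- and 2,3-dimethylpentane, and 48 by 2,4-dimethylpentane and 3-ethylpentane. A recipient who knows only such a value cannot tell which of the matching structures produced it, which is the property the abstract refers to as degeneracy.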
Similar resources
Descriptor collision and confusion: Toward the design of descriptors to mask chemical structures
We examined "descriptor collision" for several chemical fingerprint systems (MDL 320, Daylight, SMDL), and for a 2D-based descriptor set. For large databases (ChemNavigator and WOMBAT), the smallest collision rate remains around 5%. We systematically increase the "descriptor collision" rate (here termed "descriptor confusion"), in order to design a set of "descriptors to mask chemical structure...
In-silico prediction of Cellular Responses to Polymeric Biomaterials from Their Molecular Descriptors
In this work quantitative structure activity relationship (QSAR) methodology was applied for modeling and prediction of cellular response to polymers that have been designed for tissue engineering. After calculation and screening of molecular descriptors, linear and nonlinear models were developed by using multiple linear regressions (MLR) and artificial neural network (ANN) methods. The root m...
Surrogate data - a secure way to share corporate data
The privacy of chemical structure is of paramount importance for the industrial sector, in particular for the pharmaceutical industry. At the same time, companies handle large amounts of physico-chemical and biological data that could be shared in order to improve our molecular understanding of pharmacokinetic and toxicological properties, which could lead to improved predictivity and shorten t...
QSAR models to predict physico-chemical Properties of some barbiturate derivatives using molecular descriptors and genetic algorithm- multiple linear regressions
In this study the relationship between choosing appropriate descriptors by genetic algorithm to the Polarizability (POL), Molar Refractivity (MR) and Octanol/water Partition Coefficient (LogP) of barbiturates is studied. The chemical structures of the molecules were optimized using ab initio 6-31G basis set method and Polak-Ribiere algorithm with conjugated gradient within HyperChem 8.0 environ...
Identification of Groupings of Graph Theoretical Molecular Descriptors Using a Hybrid Cluster Analysis Approach
There is an abundance of structural molecular descriptors of various forms that have been proposed and tested over the years. Very often different descriptors represent, more or less, the same aspects of molecular structures and, thus, they have diminished discriminating power for the identification of different structural features that might contribute to the molecular property, or activity of...
Journal title:
- Journal of computer-aided molecular design
Volume 19, Issue 9-10
Pages -
Publication date 2005